13 research outputs found

    Parallelizing the Chambolle Algorithm for Performance-Optimized Mapping on FPGA Devices

    The performance and the efficiency of recent computing platforms have been deeply influenced by the widespread adoption of hardware accelerators, such as Graphics Processing Units (GPUs) or Field Programmable Gate Arrays (FPGAs), which are often employed to support the tasks of General Purpose Processors (GPPs). One of the main advantages of these accelerators over their sequential counterparts (GPPs) is their ability to perform massively parallel computation. However, in order to exploit this competitive edge, it is necessary to extract the parallelism from the target algorithm to be executed, which is in general a very challenging task. This concept is demonstrated, for instance, by the poor performance achieved on relevant multimedia algorithms, such as Chambolle, which is a well-known algorithm employed for optical flow estimation. The implementations of this algorithm that can be found in the state of the art are generally based on GPUs, but barely improve the performance that can be obtained with a powerful GPP. In this paper, we propose a novel approach to extract the parallelism from computation-intensive multimedia algorithms, which includes an analysis of their dependency schema and an assessment of their data reuse. We then perform a thorough analysis of the Chambolle algorithm, providing a formal proof of its inner data dependencies and locality properties. Then, we exploit the considerations drawn from this analysis by proposing an architectural template that takes advantage of the fine-grained parallelism of FPGA devices. Moreover, since the proposed template can be instantiated with different parameters, we also propose a design metric, the expansion rate, to help the designer estimate the efficiency and performance of the different instances, making it possible to select the right one before the implementation phase. We finally show, by means of experimental results, how the proposed analysis and parallelization approach leads to the design of efficient and high-performance FPGA-based implementations that are orders of magnitude faster than the state-of-the-art ones.

    A Work-Related Musculoskeletal Disorders (WMSDs) Risk-Assessment System Using a Single-View Pose Estimation Model

    Musculoskeletal disorders are an unavoidable occupational health problem. In particular, workers who perform repetitive tasks onsite in the manufacturing industry suffer from musculoskeletal problems. In this paper, we propose a system that evaluates the posture of workers in the manufacturing industry using single-view 3D human pose estimation, which estimates posture in 3D from an RGB camera that can easily capture a worker's posture in a complex workplace. The proposed system builds a Duckyang-Auto Worker Health Safety Environment (DyWHSE), a manufacturing-industry-specific dataset, to estimate the wrist pose evaluated by the Rapid Upper Limb Assessment (RULA). Additionally, we evaluate the quality of the built DyWHSE dataset using the Human3.6M dataset, and the applicability of the proposed system is verified by comparing it with the evaluation results of the experts. The proposed system provides quantitative assessment guidance for working posture risk assessment, assisting the continuous posture assessment of workers.

    Toward automatic vocal tract area function estimation from accelerated three-dimensional magnetic resonance imaging

    Vocal tract area function estimation from three-dimensional (3D) volumetric datasets often involves complex and manual procedures such as oblique slice cutting and image segmentation. We introduce a semi-automatic method for estimating vocal tract area function from 3D Magnetic Resonance Imaging (MRI) datasets. The method was implemented on a custom MATLAB graphical user interface and computes the area function in a user-interactive way. The 3D MRI datasets were acquired with 1.25 mm isotropic resolution during 8-second sustained sound productions of vowels /IY/, /AA/, /UW/ by one male native speaker of American English at a 3 Tesla MRI scanner.

    Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC)

    USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have also been collected from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460-sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community.